For demonstration purposes, we include the latent representations extracted from the BERT base uncased model under the LWP attack on the SST2 dataset. The clean-sample features are in clean_features.csv and the backdoored-sample features are in bd_features.csv.


To reproduce Figure 4(c), run the following command from a terminal in an environment with the required packages installed: python lwp_auc_plot.py

Required Packages:
numpy
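random (part of the Python standard library, no installation needed)
sklearn (installed under the package name scikit-learn)
statsmodels
matplotlib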

For all other latent representations, please follow the instructions in the appendix PDF to obtain them. To reproduce the remaining AUCROC plots in the main text, replace the latent feature files loaded in 'lwp_auc_plot.py' with the corresponding ones.
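
Before swapping new feature files into 'lwp_auc_plot.py', it may help to check that they are well formed. The following is a minimal sketch, not part of the released code. It assumes each CSV holds one sample per row, contains only numeric values, and has no header row; the function names read_feature_rows and check_feature_files are hypothetical.

```python
import csv


def read_feature_rows(lines):
    """Parse CSV lines into a list of float vectors, one per sample.

    Assumption: every non-blank row is purely numeric (no header row).
    """
    rows = []
    for record in csv.reader(lines):
        if record:  # skip blank lines
            rows.append([float(value) for value in record])
    return rows


def check_feature_files(clean_lines, bd_lines):
    """Return (n_clean, n_bd, dim), or raise ValueError on a mismatch."""
    clean = read_feature_rows(clean_lines)
    bd = read_feature_rows(bd_lines)
    if not clean or not bd:
        raise ValueError("a feature file is empty")
    # Every sample in both files should have the same dimensionality.
    dims = {len(row) for row in clean + bd}
    if len(dims) != 1:
        raise ValueError(f"inconsistent feature dimensions: {sorted(dims)}")
    return len(clean), len(bd), dims.pop()
```

Both arguments can be open file objects, for example the results of open("clean_features.csv", newline="") and open("bd_features.csv", newline=""). If the files do contain a header row, float() will raise a ValueError and the header should be skipped first.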

----------------------------------

The entire code, as well as all the extracted latent representations, will be released after acceptance.